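This work proposes a domain-informed neural network architecture for experimental particle physics, using particle interaction localization with time-projection chamber (TPC) technology for dark matter research as an example application. A key feature of the signals generated within a TPC is that they allow particle interactions to be localized through a process called reconstruction. While multilayer perceptrons (MLPs) have emerged as a leading contender for reconstruction in TPCs, such a black-box approach does not reflect prior knowledge of the underlying scientific processes. This paper takes a fresh look at neural network-based interaction localization and encodes prior detector knowledge, in terms of both signal characteristics and detector geometry, into the feature encoding and output layers of a multilayer neural network. The resulting domain-informed neural network (DINN) limits the receptive fields of the neurons in the initial feature encoding layers in order to account for the spatially localized nature of the signals produced within the TPC. This aspect of the DINN resembles the emerging area of graph neural networks, in that the neurons in the initial layers connect to only a handful of neurons in the succeeding layer, which significantly reduces the number of parameters in the network compared to an MLP. In addition, to account for the detector geometry, the output layers of the network are modified using two geometric transformations to ensure that the DINN produces localizations within the interior of the detector. The end result is a neural network architecture that has 60% fewer parameters than an MLP but still achieves similar localization performance, and that provides a path to future architectural developments with improved performance because of their ability to encode additional domain knowledge into the architecture.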
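The restricted receptive fields described above can be illustrated with a masked linear layer. This is only a minimal sketch under stated assumptions, not the DINN architecture itself: the sensor and neuron coordinates, the `radius` cutoff, and the function names `receptive_field_mask`, `masked_forward`, and `connection_count` are all hypothetical, and the abstract does not say how the receptive fields are actually chosen.

```python
import math

def receptive_field_mask(sensor_xy, neuron_xy, radius):
    """Return mask[i][j] = 1 if sensor j lies within `radius` of neuron i, else 0.

    The distance-based rule is an assumption made for illustration only.
    """
    return [[1 if math.dist(n, s) <= radius else 0 for s in sensor_xy]
            for n in neuron_xy]

def masked_forward(weights, mask, signal):
    """Linear layer in which masked-out connections contribute nothing."""
    return [sum(w * m * x for w, m, x in zip(w_row, m_row, signal))
            for w_row, m_row in zip(weights, mask)]

def connection_count(mask):
    """Number of active (trainable) connections under the mask."""
    return sum(sum(row) for row in mask)
```

With a mask like this, the number of trainable weights is the number of ones in the mask rather than the full sensors-times-neurons product of a dense layer, which is the kind of parameter saving the abstract refers to.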
translated by 谷歌翻译
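The fundamental task of classification given a limited number of training data samples is considered for physical systems with known parametric statistical models. Standalone learning-based and statistical model-based classifiers face major challenges in fulfilling the classification task with a small training set. Specifically, classifiers that rely solely on physics-based statistical models usually suffer from their inability to properly tune the underlying unobservable parameters, which leads to a mismatched representation of the system's behavior. Learning-based classifiers, on the other hand, typically rely on a large amount of training data from the underlying physical process, which may not be feasible in most practical scenarios. This paper proposes a hybrid classification method, termed HyPhyLearn, that exploits both physics-based statistical models and learning-based classifiers. The proposed solution is based on the conjecture that HyPhyLearn would alleviate the challenges associated with the individual approaches of learning-based and statistical model-based classifiers by fusing their respective strengths. The proposed hybrid approach first estimates the unobservable model parameters using the available (suboptimal) statistical estimation procedures, and subsequently uses the physics-based statistical models to generate synthetic data. The training data samples are then combined with the synthetic data in a learning-based classifier that is based on domain-adversarial training of neural networks. Specifically, in order to address the mismatch problem, the classifier learns a mapping from the training data and the synthetic data to a common feature space. Simultaneously, the classifier is trained to find discriminative features within this space in order to fulfill the classification task.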
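Principal component analysis (PCA) is a workhorse tool for dimensionality reduction in the era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality but also to yield features that are uncorrelated. Furthermore, the ever-increasing volume of data in the modern world often requires storing data samples across multiple machines, which precludes the use of centralized PCA algorithms. This paper focuses on the dual objective of PCA, namely dimensionality reduction and decorrelation of features, but in a distributed setting. This requires estimating the eigenvectors of the data covariance matrix, as opposed to only estimating the subspace spanned by the eigenvectors, when data is distributed across a network of machines. Although several distributed solutions to the PCA problem have been proposed recently, the convergence guarantees and/or communication overhead of these solutions remain a concern. With an eye towards communication efficiency, this paper introduces a feedforward neural network-based, one time-scale distributed PCA algorithm, termed Distributed Sanger's Algorithm (DSA), that estimates the eigenvectors of the data covariance matrix when data is distributed across an undirected, connected network of machines. Furthermore, the proposed algorithm is shown to converge linearly to a neighborhood of the true solution. Numerical results are also provided to demonstrate the efficacy of the proposed solution.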
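For orientation, the classical, centralized Sanger's rule (also known as the generalized Hebbian algorithm), whose name DSA carries, can be sketched as follows. This is not the distributed algorithm from the abstract, whose update and communication steps are not given here; the function names, the initialization, the step size, and the synthetic data generator are all assumptions made for illustration.

```python
import random

def sanger_step(W, x, lr):
    """One update of Sanger's rule: w_i += lr * y_i * (x - sum_{m<=i} y_m * w_m).

    W is a list of k rows (component estimates), x is one data sample.
    """
    k, d = len(W), len(x)
    y = [sum(W[i][j] * x[j] for j in range(d)) for i in range(k)]
    new_W = []
    for i in range(k):
        # Reconstruction of x from components 0..i (the deflation term).
        recon = [sum(y[m] * W[m][j] for m in range(i + 1)) for j in range(d)]
        new_W.append([W[i][j] + lr * y[i] * (x[j] - recon[j]) for j in range(d)])
    return new_W

def estimate_components(samples, k, lr=0.02):
    """Run Sanger's rule over a stream of zero-mean samples."""
    d = len(samples[0])
    # Small, non-collinear initialization (an arbitrary illustrative choice).
    W = [[0.1 / (1 + i + j) for j in range(d)] for i in range(k)]
    for x in samples:
        W = sanger_step(W, x, lr)
    return W

def axis_aligned_samples(n, scales, seed=0):
    """Synthetic zero-mean data whose principal axes are the coordinate axes."""
    rng = random.Random(seed)
    return [[s * rng.uniform(-1.0, 1.0) for s in scales] for _ in range(n)]
```

Unlike subspace methods, the deflation term makes each row converge to an individual eigenvector in order of decreasing eigenvalue, which is what yields uncorrelated features rather than just a basis for the principal subspace.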
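Machine learning has begun to play a central role in many applications. Many of these applications also involve datasets that are distributed across multiple computing devices/machines, due to either design constraints (e.g., multiagent systems) or computational/privacy reasons (e.g., learning on smartphone data). Such applications often require the learning tasks to be carried out in a decentralized fashion, in which there is no central server directly connected to all nodes. In real-world decentralized settings, nodes are prone to undetected failures due to malfunctioning equipment, cyberattacks, etc., which are likely to crash non-robust learning algorithms. The focus of this paper is on the robustification of decentralized learning in the presence of nodes that have undergone Byzantine failures. The Byzantine failure model allows faulty nodes to deviate arbitrarily from their intended behavior, thereby ensuring the design of the most robust algorithms. However, in contrast to distributed learning, the study of Byzantine resilience within decentralized learning is still in its infancy. In particular, existing Byzantine-resilient decentralized learning methods either do not scale well to large-scale machine learning models, or they lack statistical convergence guarantees that help characterize their generalization errors. In this paper, a scalable, Byzantine-resilient decentralized machine learning framework termed Byzantine-resilient decentralized gradient descent (BRIDGE) is introduced. Algorithmic and statistical convergence guarantees for strongly convex problems and a class of nonconvex problems are also provided. In addition, large-scale decentralized learning experiments are used to establish that the BRIDGE framework is scalable and that it delivers competitive results for Byzantine-resilient convex and nonconvex learning.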
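To make the idea of Byzantine-resilient decentralized gradient descent concrete, the sketch below uses coordinate-wise trimmed-mean screening before a local gradient step. The abstract does not state which screening rule the framework uses, so the trimmed mean, the bound `b` on faulty neighbors, and the function names are assumptions made for illustration.

```python
def trimmed_mean(vectors, b):
    """Coordinate-wise trimmed mean: in each coordinate, drop the b smallest
    and b largest values, then average what remains."""
    n = len(vectors)
    if n <= 2 * b:
        raise ValueError("need more than 2*b vectors to trim b from each side")
    dim = len(vectors[0])
    result = []
    for j in range(dim):
        column = sorted(v[j] for v in vectors)
        kept = column[b:n - b]
        result.append(sum(kept) / len(kept))
    return result

def screened_gradient_step(own_model, neighbor_models, grad, b, lr):
    """Screen own and neighbor models, then take one local gradient step.

    `grad` is the node's local gradient, assumed here to be computed elsewhere.
    """
    aggregate = trimmed_mean([own_model] + neighbor_models, b)
    return [a - lr * g for a, g in zip(aggregate, grad)]
```

Because extreme values are discarded per coordinate, a single neighbor sending arbitrarily large entries cannot drag the aggregate away from the honest nodes' values, provided `b` is at least the number of faulty neighbors.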
A comprehensive pharmaceutical recommendation system was designed based on patient and drug features extracted from Drugs.com and Druglib.com. First, data from these databases were combined, and a dataset of patient and drug information was built. Second, the patients and drugs were clustered, and recommendation was then performed using the ratings provided by patients and, importantly, the knowledge obtained from patient and drug specifications, while taking drug interactions into account. To the best of our knowledge, we are the first group to consider patients' conditions and history when selecting a specific medicine appropriate for a particular user. Our approach applies artificial intelligence (AI) models for the implementation. Sentiment analysis using natural language processing approaches is employed in pre-processing, along with neural network-based methods and recommender system algorithms for modeling the system. In our work, patients' conditions and drug features are used to build two models based on matrix factorization. We then used drug interactions to filter drugs with severe or mild interactions with other drugs. We developed a deep learning model for recommending drugs using data from 2304 patients as a training set and data from 660 patients as a validation set. Finally, we used knowledge from critical information about drugs and combined the outcome of the model into a knowledge-based system with rules obtained from constraints on taking medicine.
How can we accurately identify new memory workloads while classifying known memory workloads? Verifying DRAM (Dynamic Random Access Memory) using various workloads is an important task to guarantee the quality of DRAM. A crucial component in the process is open-set recognition, which aims to detect new workloads not seen in the training phase. Despite its importance, however, existing open-set recognition methods are unsatisfactory in terms of accuracy since they fail to exploit the characteristics of workload sequences. In this paper, we propose Acorn, an accurate open-set recognition method that captures the characteristics of workload sequences. Acorn extracts two types of feature vectors to capture sequential patterns and spatial locality patterns in memory access. Acorn then uses the feature vectors to accurately classify a subsequence into one of the known classes or identify it as the unknown class. Experiments show that Acorn achieves state-of-the-art accuracy, with up to 37 percentage points higher unknown-class detection accuracy than existing methods, while achieving comparable known-class classification accuracy.
Data heterogeneity across clients is a key challenge in federated learning. Prior works address this by either aligning client and server models or using control variates to correct client model drift. Although these methods achieve fast convergence in convex or simple non-convex problems, the performance in over-parameterized models such as deep neural networks is lacking. In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers. We observe that while the feature extraction layers are learned efficiently by FedAvg, the substantial diversity of the final classification layers across clients impedes the performance. Motivated by this, we propose to correct model drift by variance reduction only on the final layers. We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost. We furthermore provide proof for the convergence rate of our algorithm.
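As background for the FedAvg baseline revisited above, the server-side aggregation step can be sketched as a sample-size-weighted average of client parameters, layer by layer. This covers only standard FedAvg averaging, not the proposed variance reduction on the final layers; the dictionary layout and the layer names `features` and `classifier` in the usage note are hypothetical.

```python
def fedavg(client_models, client_sizes):
    """Average client parameters per layer, weighted by local sample counts.

    Each client model is a dict mapping a layer name to a flat list of floats.
    """
    total = sum(client_sizes)
    averaged = {}
    for name in client_models[0]:
        length = len(client_models[0][name])
        averaged[name] = [
            sum(model[name][i] * size
                for model, size in zip(client_models, client_sizes)) / total
            for i in range(length)
        ]
    return averaged
```

For example, with two clients holding 1 and 3 samples, a call such as `fedavg([{"features": [0.0, 4.0], "classifier": [8.0]}, {"features": [4.0, 0.0], "classifier": [0.0]}], [1, 3])` weights the second client's parameters three times as heavily as the first's.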
Supervised machine learning-based medical image computing applications necessitate expert label curation, while unlabelled image data might be relatively abundant. Active learning methods aim to prioritise a subset of available image data for expert annotation, for label-efficient model training. We develop a controller neural network that measures priority of images in a sequence of batches, as in batch-mode active learning, for multi-class segmentation tasks. The controller is optimised by rewarding positive task-specific performance gain, within a Markov decision process (MDP) environment that also optimises the task predictor. In this work, the task predictor is a segmentation network. A meta-reinforcement learning algorithm is proposed with multiple MDPs, such that the pre-trained controller can be adapted to a new MDP that contains data from different institutes and/or requires segmentation of different organs or structures within the abdomen. We present experimental results using multiple CT datasets from more than one thousand patients, with segmentation tasks of nine different abdominal organs, to demonstrate the efficacy of the learnt prioritisation controller function and its cross-institute and cross-organ adaptability. We show that the proposed adaptable prioritisation metric yields converging segmentation accuracy for the novel class of kidney, unseen in training, using approximately 40% to 60% of the labels otherwise required with other heuristic or random prioritisation metrics. For clinical datasets of limited size, the proposed adaptable prioritisation offers a performance improvement of 22.6% and 10.2% in Dice score, for tasks of kidney and liver vessel segmentation, respectively, compared to random prioritisation and alternative active sampling strategies.
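The Dice score used to report the improvements above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch for flat binary masks follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions, since the abstract does not describe its evaluation code.

```python
def dice_score(pred, target):
    """Dice overlap between two equal-length binary masks (lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    if denominator == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / denominator
```

For multi-class segmentation, a score like this is typically computed per organ on the binary mask of that class and then reported per task, as the abstract does for kidney and liver vessel.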
The primary obstacle to developing technologies for low-resource languages is the lack of representative, usable data. In this paper, we report the deployment of technology-driven data collection methods for creating a corpus of more than 60,000 translations from Hindi to Gondi, a low-resource vulnerable language spoken by around 2.3 million tribal people in south and central India. During this process, we help expand information access in Gondi across two different dimensions: (a) the creation of linguistic resources that can be used by the community, such as a dictionary, children's stories, Gondi translations from multiple sources, and an Interactive Voice Response (IVR) based mass awareness platform; and (b) enabling its use in the digital domain by developing a Hindi-Gondi machine translation model, which is compressed by nearly 4 times to enable its deployment on low-resource edge devices and in areas of little to no internet connectivity. We also present preliminary evaluations of using the developed machine translation model to assist volunteers who are involved in collecting more data for the target language. Through these interventions, we not only created a refined and evaluated corpus of 26,240 Hindi-Gondi translations that was used for building the translation model but also engaged nearly 850 community members who can help take Gondi onto the internet.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.